Supplementary of Multi-scale Deep Learning Architectures for Person Re-identification
Abstract
Multi-scale-A layer (Fig. 1) analyses the data stream with receptive fields of size 1 × 1, 3 × 3 and 5 × 5. Furthermore, to increase both the depth and the width of this layer, we split the 5 × 5 filter into two cascaded 3 × 3 streams (i.e. stream-4 and stream-3 in Tab. 1 and Fig. 1). The weights of each stream are also tied with those of the corresponding stream in the other branch. This design is inspired by, yet different from, the inception architectures [11, 12, 10]. The key difference lies in the weights, which are not tied between any two streams of the same branch but are tied between the two corresponding streams of different branches.

Reduction layer (Fig. 2) further passes the data stream at multiple scales and halves the width and height of the feature maps, which should, in principle, be reduced from 78 × 28 to 39 × 14. We thus employ the Reduction layer to gradually decrease the size of the feature representations, as illustrated in Tab. 1 and Fig. 2, following the design principle of “avoid representational bottlenecks” [12]. Our ablation study shows that replacing the Reduction layer with a max-pooling layer to decrease the feature map size lowers Rank-1 accuracy on the CUHK01 dataset by more than 10 absolute points relative to the reported results. Again, the weights of each filter here are tied for paired streams.

Multi-scale-B layer (Fig. 3) serves as the last stage of high-level feature extraction for the multiple scales of 1 × 1, 3 × 3 and 5 × 5. Besides splitting the 5 × 5 stream into two cascaded 3 × 3 streams (i.e. stream-4 and stream-3 in Tab. 1 and Fig. 3), we can further decompose the 3 × 3 C-filters into one 1 × 3 C-filter followed by a 3 × 1 C-filter [10]. This leads to several benefits, including reducing the computation cost of the 3 × 3 C-filters, further increasing the ...
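The size arithmetic behind these layers can be checked with a short sketch. It is not the authors' implementation: the function names are hypothetical, the weight counts ignore biases and assume the same channel count throughout a cascade, and the stride-2, padding-1, 3 × 3 setting used for the halving step is an assumption chosen only because it reproduces the 78 × 28 to 39 × 14 reduction described above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1


def stacked_receptive_field(kernels):
    """Receptive field along one axis of stride-1 filters applied in sequence."""
    field = 1
    for k in kernels:
        field += k - 1
    return field


def weights_per_channel_pair(kernels):
    """Weights per (input channel, output channel) pair, summed over a cascade.

    Assumption: biases are ignored and the channel count stays constant,
    so the totals can be compared directly.
    """
    return sum(height * width for height, width in kernels)


# Two cascaded 3 x 3 filters cover the same 5 x 5 receptive field.
print(stacked_receptive_field([3, 3]))

# A 1 x 3 filter followed by a 3 x 1 filter covers 3 x 3:
# along the width the kernel sizes are (3, 1), along the height (1, 3).
print(stacked_receptive_field([3, 1]), stacked_receptive_field([1, 3]))

# Weight counts per channel pair: 5 x 5, two 3 x 3, one 3 x 3, 1 x 3 + 3 x 1.
print(weights_per_channel_pair([(5, 5)]))
print(weights_per_channel_pair([(3, 3), (3, 3)]))
print(weights_per_channel_pair([(3, 3)]))
print(weights_per_channel_pair([(1, 3), (3, 1)]))

# Assumed stride-2 setting that halves a 78 x 28 feature map.
print(conv_out(78, kernel=3, stride=2, padding=1),
      conv_out(28, kernel=3, stride=2, padding=1))
```

Under these assumptions the cascaded and factorised filters keep the receptive field of the filter they replace while using fewer weights per channel pair, which is consistent with the computation-cost benefit mentioned for the Multi-scale-B layer.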
Similar resources
Multi-Channel Pyramid Person Matching Network for Person Re-Identification
In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of the semantic-components and the color-texture distributions to address the problem of person re-identification. In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid ...
Person Re-identification: Past, Present and Future
Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use ...
Multi-pseudo Regularized Label for Generated Samples in Person Re-Identification
Sufficient training data is normally required to train deeply learned models. However, the number of pedestrian images per ID in person re-identification (re-ID) datasets is usually limited, since manual annotations are required for multiple camera views. To produce more data for training deeply learned models, a generative adversarial network (GAN) can be leveraged to generate samples for pers...
Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification
Most existing person re-identification (re-id) methods require supervised model learning from a separate large set of pairwise labelled training data for every single camera pair. This significantly limits their scalability and usability in real-world large scale deployments with the need for performing re-id across many camera views. To address this scalability problem, we develop a novel deep...
Tracking by Prediction: A Deep Generative Model for Mutli-Person localisation and Tracking
Current multi-person localisation and tracking systems have an over-reliance on the use of appearance models for target re-identification and almost no approaches employ a complete deep learning solution for both objectives. We present a novel, complete deep learning framework for multi-person localisation and tracking. In this context we first introduce a lightweight sequential Generative Adv...
Publication date: 2017